



WEIGHING THE SCALES 



ALLEN CROSS 
State Teachers College, Greeley, Colorado 



Within five years some remarkable weighing-contrivances have 
been exhibited to the public. Twenty-five years ago all we heard 
about scales was contained in the printed advertisements of Jones 
of Binghamton — the man who paid the freight on five-ton wagon 
scales. Such weighing-machines are gross and common now that 
pedagogical scales have been invented. Place a child in contact 
with one of these new devices, and behold! you may know to a 
fraction of a decimal how much arithmetic and grammar he contains
at the moment. You may learn what his capacity is for
spelling, writing, reading, drawing, and a number of the high-school 
subjects. It is my intention to discuss the value of these scales 
as schoolroom apparatus and to make some comment upon their 
accuracy in doing the work for which they are designed. 

I do not expect to set forth anything new in this field. More 
than I can possibly include in this paper is at the disposal of those 
who will run a long way and read with considerable diligence the 
complete exposition of these matters in the Journal of Educational 
Psychology, Teachers College Record, Pedagogical Seminary, and 
such magazines. But only the specialist follows these investigations
in detail while they are in the process of being brought forth.
The teacher of English or the classroom elementary teacher has 
neither the time nor, in most cases, the inclination to follow up 
these matters until they become fairly well reduced past the 
experimental and down to the working stage. 

Relying upon the supposition that many teachers are too busy 
to read all that is written about scales, but that they would be 
glad to have someone else read and summarize for them, I am 
undertaking to weigh the scales (may they not be found like the 
false balance of Proverbs — an abomination to the Lord). 




184 THE ENGLISH JOURNAL 

The fundamental theory of educational scales is that a standard 
of expectancy may be determined by taking an average from a 
large number of responses made by pupils of a given age and grade. 
The simplest of them all and the earliest working scale is the Ayres 
scale for testing handwriting. An explanation of this may be used 
as an introduction to the others. 

Handwriting is good if it is rapid and legible. A scale of the 
various grades of handwriting could be made by having a large 
number of children of all ages write for a given number of seconds 
or minutes, producing as many readable words as they could in 
the given time. Now let these be tested by a group of readers for 
legibility by taking the average time for reading a given number of 
words. Then arrange the selected samples in the order of their 
legibility, and you will have a rough working scale. There are a 
number of refinements to be considered before you have a complete 
scientific scale based upon both speed and legibility, but what I
have given are the working parts of the devices perfected by Ayres
and Thorndike. The Thorndike scale is made up of samples of 
writing representing qualities from 4 to 18. The Ayres scale has 
similar samples corresponding to the qualities 7-14 of the other 
scale. But Ayres numbers his qualities 20, 30, 40, 50, 60, 70, 80, 
and 90. 
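
The correspondence between the two handwriting scales can be put as a
small sketch. It assumes, from the figures given here, that Thorndike
qualities 7 through 14 line up in order with the eight Ayres qualities
20 through 90; the function name is illustrative only and belongs to
neither scale.

```python
def thorndike_to_ayres(quality: int) -> int:
    """Convert a Thorndike handwriting quality (7-14) to the Ayres number.

    Assumption: the eight Ayres steps 20, 30, ..., 90 correspond in order
    to Thorndike qualities 7, 8, ..., 14.
    """
    if not 7 <= quality <= 14:
        raise ValueError("the Ayres scale covers only Thorndike qualities 7-14")
    return 10 * (quality - 5)


# Print the whole correspondence, one pair per line.
for q in range(7, 15):
    print(q, thorndike_to_ayres(q))
```

On this reading a sample judged 10 on the Thorndike scale stands at 50
on the Ayres scale, and one judged 11 stands at 60.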

To test the quality of the writing of any child is a simple matter. 
Let him write a sentence over and over as many times as possible 
in two minutes. Then take the sample and move it up and down 
the chart that contains the scale types. You may find it better 
than the 10 quality of the Thorndike scale (or the 50 of the Ayres), 
but not so good as the 11 (or 60). Then calculate the speed, and 
compare it with the average for that quality and you will have the 
sample evaluated. Having done this for a pupil on September 1, 
you can repeat the test on October 1, November 1, and so on at 
some regular interval through the year. In this way the teacher 
is able to know accurately whether the pupil is making progress 
and how much. She may even set a standard for her grade — a 
standard toward which to strive, at least. Since writing is a purely 
mechanical process, the scales are satisfactory means of judging 
attainment and progress. What a joy for us who teach English,
or who teach general grade subjects, if we could apply similar
tests to all pupils in all subjects. If we could do that, there would 
no longer be any doubt about measuring the progress of our pupils. 
Is Fred to be promoted to the seventh grade at Christmas? Let
us see. Imagine that the standards to be reached by the sixth 
grade are: spelling, quality 13; writing, quality 12; language, 
quality 15; arithmetic, quality 14; geography, quality 15; history, 
quality 12; drawing; woodwork; etc. Put Fred on the scales 
one after another. He is slightly above in some of the subjects, 
but below in a few of the manual subjects. Let him pass into the 
seventh grade. 

Unfortunately the problem is not so simple as this. It is very 
difficult to devise scales for the less mechanical subjects. It may 
even be impossible for some subjects. Thus far, however, scales 
have been constructed to test pupils in reading, writing, spelling, 
grammar, composition, arithmetic, and drawing. Some scales have 
been made also for high-school Latin, German, French, and 
physics. 

The English teacher in the elementary school needs scales for 
reading, spelling, grammar, composition (oral and written), and 
literary appreciation. 

Thus far these have been provided for all except oral composition 
and literary appreciation. The scales for phases of English published
to date are the following:

1. Reading: 

(1) A system of tests devised by E. L. Thorndike. 

(2) A system of tests devised by D. Starch. 

(3) A system of tests devised by F. J. Kelly. 

(4) A system of tests devised by W. S. Gray. 

2. Spelling: 

(1) A system of tests devised by L. P. Ayres. 

(2) A system of tests devised by B. R. Buckingham. 

(3) A system of tests devised by D. Starch. 

3. Grammar: 

(1) A system of tests devised by D. Starch. 

4. Composition: 

(1) A system of tests devised by M. B. Hillegas. 

(2) A system of tests devised by F. W. Ballou. 

(3) A system of tests devised by S. A. Courtis. 




The remainder of the paper will be an explanation of the way 
in which the scales are made and applied, with some comments in 
conclusion about their applicability and their value to teachers. 

THE READING SCALES 

We read mainly to comprehend meaning for ourselves. Accuracy
of comprehension and speed are the two important elements
to be tested. Speed is tested in each step of the tests by determining
the number of words read in a given number of seconds.
Comprehension is judged by having the pupil write after he has 
finished reading a sentence or a paragraph. 

The Starch reading scale is made up of nine paragraphs increasing
in difficulty in a fairly uniform gradation. Large type is used for
the lower steps. The child reads one of these paragraphs for exactly 
thirty seconds. The number of words he reads will determine his 
speed. He then turns the blank over and writes on the back 
of the sheet all he can remember of the substance (the meaning) 
of the paragraph, taking all the time he needs. This written 
account is then read and all words which fail to reproduce the 
thought and words which are repetitions in the pupil's paragraph 
are cast out. For example, if he has written fifty words, six of 
which are cast out, his reproduction score is 44. The two scores 
are then averaged and the result set down as that pupil's score in 
reading. 
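
The arithmetic of the Starch reading score, as described above, may be
sketched as follows. This is only an illustration: it assumes that the
speed score is the raw count of words read in the thirty seconds (the
description does not say how it is scaled before averaging), and the
function and parameter names are invented for the purpose.

```python
def starch_reading_score(words_read: int, words_written: int,
                         words_cast_out: int) -> float:
    """Average the speed score and the reproduction score.

    Assumption: the speed score is simply the number of words read in
    thirty seconds; no further scaling is applied.
    """
    if not 0 <= words_cast_out <= words_written:
        raise ValueError("cannot cast out more words than were written")
    # Words that fail to reproduce the thought, or that merely repeat,
    # are cast out of the pupil's written account.
    reproduction = words_written - words_cast_out
    return (words_read + reproduction) / 2


# Fifty words written and six cast out is the example in the text;
# 120 words read is an invented figure for illustration.
print(starch_reading_score(words_read=120, words_written=50, words_cast_out=6))
```

With these figures the reproduction score is 44, as in the example, and
the averaged score comes to 82.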

Both Thorndike and Starch have supplementary vocabulary 
tests to accompany the tests for reading. The Thorndike reading 
scale uses sentences instead of paragraphs such as Starch uses. 
Instead of asking the pupil to reproduce the thought of the paragraph,
it puts a series of questions to test his comprehension. This
scale does not test for speed. 

The third of the reading tests is that devised by F. J. Kelly and 
is called the Kansas silent-reading test. The scale consists of a 
series of short paragraphs of increasing difficulty, each one calling 
for some simple action as a test of comprehension. The pupil is 
given five minutes to read and respond to as many of these as he 
can. His standing is the sum of the credits allowed (the harder 
paragraphs are allowed more credits) for those paragraphs to
which he has responded correctly. There are sixteen paragraphs
for Grades III, IV, and V; sixteen for Grades VI, VII, and VIII;
and another sixteen for Grades IX, X, XI, XII. The eighth grade 
may also be tested on the last group. 
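
The scoring rule of the Kansas test, a sum of credits over the
paragraphs answered correctly, can be sketched in a few lines. The
credit values below are invented for illustration (the actual credits
are not given here), and paragraphs not reached in the five minutes are
assumed to count simply as not correct.

```python
def kansas_score(paragraph_credits, answered_correctly):
    """Sum the credits of the paragraphs the pupil answered correctly.

    Both arguments are lists of equal length, one entry per paragraph,
    in order of increasing difficulty.
    """
    if len(paragraph_credits) != len(answered_correctly):
        raise ValueError("need one right/wrong entry per paragraph")
    return sum(credit
               for credit, ok in zip(paragraph_credits, answered_correctly)
               if ok)


# Sixteen paragraphs; credits of 1 through 16 are hypothetical values
# chosen only so that harder paragraphs earn more.
paragraph_credits = list(range(1, 17))
# The pupil answers paragraphs 1-5 and 7 correctly and gets no further.
answered = [True, True, True, True, True, False, True] + [False] * 9
print(kansas_score(paragraph_credits, answered))
```

Under these invented credits the pupil's standing is 22.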

THE SPELLING SCALES 

There are two short spelling tests — the Buckingham and the 
Ayres short list. The Buckingham is made of two lists of words, 
twenty-five common words of known difficulty in each. The 
Ayres short list has ten words per grade, suited to Grades II-VIII.
While these are good for quick tests, the chance that an individual
pupil happens to know the particular words makes the short lists
undesirable for a final test.

The longer Ayres scale contains a thousand words arranged in 
order of difficulty to suit the school grades from the second to the 
eighth. The grading is in percentages based on the number of 
misspelled words. 

The Starch scale consists of six lists, each of 100 representative,
non-technical words chosen systematically from the whole English
vocabulary. All the pupils in the eight grades are tested on one of
the lists on a given day. The first grade is held for only the 40 
easy words at the head of the list, the second for the first 65 words, 
the third for the first 80, the fourth for 90, and the other grades for 
all the words. The following day another list of 100 is given to 
check the accuracy of the first day's test. The six lists are given 
to provide variety in case the test should be repeated from time to 
time. 
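
The grade allowances in the Starch spelling test can be set out as a
small table. The sketch below assumes that each list runs from easy
words to hard, that the fourth grade is held for the first 90 words, and
that a pupil's result is the count of words spelled correctly among
those for which his grade is held; the names are illustrative only.

```python
# Number of words, counted from the head of the list, for which each
# grade is held, following the figures given in the text.
WORDS_HELD_FOR = {1: 40, 2: 65, 3: 80, 4: 90, 5: 100, 6: 100, 7: 100, 8: 100}


def words_spelled_correctly(grade: int, results: list) -> int:
    """Count correct spellings among the words this grade is held for.

    `results` holds one True/False entry per word of the 100-word list,
    in list order (easiest words first).
    """
    if len(results) != 100:
        raise ValueError("each Starch list has 100 words")
    held = WORDS_HELD_FOR[grade]
    return sum(results[:held])


# A hypothetical pupil who spells the first 50 words correctly and
# misses the rest, scored as a second-grader and as a first-grader.
results = [True] * 50 + [False] * 50
print(words_spelled_correctly(2, results))
print(words_spelled_correctly(1, results))
```

The second-grader is credited with 50 of his 65 words; the same paper
from a first-grader would show all 40 of his words correct.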

THE GRAMMAR TEST 

The Starch test in grammar recognizes that knowledge of
grammatical nomenclature and the ability to analyze sentences
are of very much less importance than the ability to use the English
language correctly. The scale does, however, provide tests both
for ability to use the language correctly and for a knowledge of 
formal grammar. The first part of the usage test has groups of 
sentences (usually four in a group) graded from a value of 5 to a 
value of 16. Two forms are given for each sentence, and the child 
is asked to choose the correct one. The second part of the usage
scale has six groups of four sentences each, graded from a quality
marked 7 to one marked 12. A third group covers other points 
with five groups of qualities from 7 to 11.

The punctuation test consists of ten groups of sentences graded 
in qualities from 6 to 16. Most of the sentences are taken from the 
punctuation exercises in Woolley's Handbook of Composition. 

Following these tests of ability to speak and write the language 
correctly there are three grammar tests. The first is on the recognition
of parts of speech, the second on recognizing and naming
case forms, and the third on the recognition and naming of modes 
and tenses of verbs. Groups of sentences and paragraphs are given 
in each of these three grammar tests as the material from which 
the parts of speech, the case forms, and the verb forms are to be 
chosen. 

Upon the whole, the test seems a satisfactory one. If a teacher 
wished to test the relative knowledge of usage and grammatical 
terms of the pupils in a given grade, this test would furnish the 
necessary machinery. Or if a superintendent wished to compare 
the seventh grades of his system or his schools with those of other 
cities, this standard test would be found to be a very convenient 
tool to use. Dr. Starch recognizes that the scale is not complete 
for grammatical terms and that an added section for analyzing 
and diagramming is desirable; but this seems less important to me 
than he makes it. 

THE COMPOSITION TESTS 

The Hillegas-Thorndike scale is similar in construction to the 
penmanship scales of Ayres and Thorndike. Fifteen compositions 
ranging in quality from 0 to 95 have been chosen. The child who is
being tested writes a short composition, using fifteen minutes to 
do it. Now, this composition is compared with one after another 
of the samples in the scale until one is found of approximately the 
same value. It would seem that there is too much room for individual
opinion in judging here; but in the experiments I have made
with the scale, having a number of persons read the same composition
and then grade it by the Hillegas scale, the results were much
more nearly uniform than I expected. 




The Harvard-Newton (Ballou) scale is applicable only to upper-grade
and high-school compositions. Like the Hillegas scale, it
compares the pupil's compositions with a series of compositions 
arranged in a scale of values. But here there are four sets of compositions
used, one for each of the four commonly recognized types
of writing. The descriptive group has six qualities ranging from
95 per cent to 45 per cent. The expository has six ranging from
91.8 per cent to 39.1 per cent. The narrative group has six
ranging from 93.5 per cent to 46.9 per cent. The argumentative
group has its six ranging from 93.2 per cent to 47 per cent. These
very accurate percentages were obtained by averaging the marks 
of the twenty-four readers who graded the original sets. In using 
the scale these minute distinctions are discarded, and "A," "B," 
"C," "D," "E," and "F" are used instead. 

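The way the Harvard-Newton values arise, by averaging the readers'
marks and then replacing the averages with letters, can be sketched as
below. The marks are invented, three readers stand in for the
twenty-four, and the sample names and function name are hypothetical;
only the averaging and the letters "A" to "F" come from the description
above.

```python
from statistics import mean

LETTERS = "ABCDEF"


def build_scale(marks_by_sample):
    """Average each sample's marks and letter the samples best to worst.

    `marks_by_sample` maps a sample name to the list of percentage marks
    given by the readers. Returns (letter, name, average) tuples.
    """
    averaged = {name: mean(marks) for name, marks in marks_by_sample.items()}
    ranked = sorted(averaged.items(), key=lambda item: item[1], reverse=True)
    return [(letter, name, round(avg, 1))
            for letter, (name, avg) in zip(LETTERS, ranked)]


# Six hypothetical compositions of one type, each marked by three readers.
marks = {
    "s4": [66, 64, 65],
    "s1": [94, 95, 96],
    "s6": [44, 45, 46],
    "s2": [84, 86, 85],
    "s5": [55, 54, 56],
    "s3": [74, 75, 76],
}
for letter, name, average in build_scale(marks):
    print(letter, name, average)
```

The minute averages serve only to put the six samples in order; in use
the teacher works with the letters alone.
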
THE COURTIS TESTS 

The Courtis tests cover handwriting, English composition,
spelling, punctuation, and grammar, rates of reading and writing, 
and rates of reproduction. Dr. Courtis uses the Thorndike and 
Ayres scales in writing and the Rice scale for composition. Courtis 
gives a complete bibliography of the magazine articles upon the 
tests in his Manual of Instructions for giving and scoring the Courtis 
standard tests. 

I have not had an opportunity to examine Mr. M. A. Brown's 
The Measurement of Ability to Read. 

It seems to me to be a desirable thing to have a working set 
of educational scales for use in the schools. As it is at present, the
accomplishment of the sixth grade in one part of a city may be
very much below that generally expected in the city, or the requirements
of one town in a given grade may be decidedly different from
those of other towns. The problems of promotion and of transfer 
from one school to another would be very simple if pupils were 
tested by a set of the standard scales. 

The most valuable feature, however, would be that a teacher 
would be equipped with a means of testing the progress of the 
individual pupils in her room from time to time during the school 
year, and of testing the progress of the whole grade as well. As it
is at present, with our imperfect methods of judging the progress
of pupils, we are never quite sure that they are moving along. We 
have no means (outside of our impressions, our hopes, our enthusiasms,
our depressions, our likes or dislikes) of judging either the
rate of progress or the total advancement made by any pupil. 
Scales, even though imperfect, are a great aid to accuracy in these 
matters. 

Improvement is no doubt possible in all the scales. The scales 
now in use will be made more nearly accurate from time to time, 
and scales will be devised for some other of the school subjects. 
The direction which the improvements need to take is toward 
greater accuracy and greater simplicity. These improvements 
will enable the mass of teachers the country over to use the scales 
for the practical purposes of the schoolroom. As the tests are at 
present in the experimental state, lacking uniformity, and fairly 
complex, their usefulness is limited to the few teachers who are 
willing to do the experimental and pioneering work of the schoolroom.
But in time I confidently expect the tests to be perfected
to such an extent that when the next inspector of weights and 
measures, say eight or ten years from now, attempts to weigh the 
scales, they will be found in no way wanting. 

BIBLIOGRAPHY 

An exposition of each of these sets of tests with the exercises 
for the scales may be found in Dr. Daniel Starch's Educational 
Measurements (Macmillan, 1916). This book does not contain 
the extension of Professor Thorndike's reading scale, published in 
the Teachers College Record for November, 1915, and January, 
1916. Dr. Starch's book contains also a detailed bibliography, 
fairly complete, of the monographs and articles dealing with each 
of these scales. In addition to his list I would mention Dr. Kayfetz' 
reviews of the Hillegas-Thorndike and the Harvard-Newton 
(Ballou) scales for written composition, the first in Pedagogical 
Seminary for December, 1914, and the latter in the same journal 
for September, 1916. The Measurement of Ability to Read, by 
Mr. M. A. Brown, is Bulletin No. 1 of the Bureau of Research, 
New Hampshire Department of Education. 




Blanks for making the tests and recording the results for all the 
Starch tests may be obtained at a small expense from Dr. Daniel 
Starch, University of Wisconsin, Madison, Wisconsin. The 
Ayres scales may be had from Leonard P. Ayres, director of the 
Russell Sage Foundation, Division of Education, New York City. 
Blanks and forms for the Thorndike-Hillegas scales may be obtained 
from Teachers College, Columbia University. The Harvard-Newton
Composition Scale is published by Harvard University,
Cambridge, Massachusetts; The Courtis Tests, by S. A. Courtis, 82 
Elliot Street, Detroit; and The Kansas Silent-Reading Test, by 
Professor F. J. Kelly, State Normal School, Emporia, Kansas. 
In most cases the authors have prepared all the printed sheets 
necessary for making the tests and have them for sale at about the 
cost of printing. 



